Packages loaded (all returned TRUE): data.table, hashTagR, ggplot2, lubridate, readr, rtweet, tidyverse, tidytext, wordcloud
savingSessions: tweet analysis
1 Background
This report concerns the UK demand response experiments run by NG-ESO and retailers such as @OctopusEnergy.
It is an attempt to do some analysis of #savingSession(s) tweets.
2 Code setup
Part of https://github.com/dataknut/savingSessions
Makes use of https://github.com/dataknut/hashTagR, a DIY wrapper for the rtweet rstats package.
3 Getting data
Grab the most recent set of tweets that mention #savingSession OR #savingSessions OR #savingsession using the rtweet::search_tweets() function and merge them with any we may already have downloaded.
Note that tweets do not seem to be available after ~ 14 days via the API used by rtweet. Best to keep refreshing the data every week…
[1] "Found 54 files matching *.csv in ~/Dropbox/data/twitter/savingSessions/"
That produced a data file of 3229 tweets.
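The merge step can be sketched roughly as follows. This is a minimal illustration, not the code used in the repo: the helper name `mergeTweets`, the toy data and the `id_str` column name are assumptions made for the example. The fresh download would come from something like the commented `rtweet::search_tweets()` call, which needs Twitter API credentials.

```r
library(data.table)

# Hypothetical helper: stack previously saved tweets and a fresh download,
# then keep one row per tweet id. The id column name is an assumption.
mergeTweets <- function(oldDT, newDT, idCol = "id_str") {
  allDT <- rbindlist(list(oldDT, newDT), use.names = TRUE, fill = TRUE)
  unique(allDT, by = idCol)
}

# Toy stand-ins for the saved .csv files and the latest search result
oldDT <- data.table(id_str = c("1", "2"),
                    text = c("a #savingSession tweet", "another one"))
newDT <- data.table(id_str = c("2", "3"),
                    text = c("another one", "a newer tweet"))

# A real refresh might look like this (requires API access):
# newDT <- as.data.table(rtweet::search_tweets("#savingSession OR #savingSessions", n = 1000))

mergedDT <- mergeTweets(oldDT, newDT)
```

De-duplicating on the tweet id means the weekly refreshes can overlap without inflating the tweet count.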
We do NOT store the tweets in the repo for both ethical and practical reasons…
Note also that we may not be collecting the complete dataset of hashtagged tweets due to the intricacies of the Twitter API.
4 Analysis
4.1 Tweet time line
Figure 1 shows the timing of tweets by hour.
Figure 2 shows cumulative tweets by hour.
We see roughly the kind of uptick in tweets for Session 2 that we saw for Session 1…
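Hourly and cumulative counts of the kind plotted in Figures 1 and 2 could be derived along these lines. This is a sketch on made-up timestamps; the column name `created_at` and the object names are assumptions, not necessarily those used in the repo.

```r
library(data.table)
library(lubridate)
library(ggplot2)

# Toy tweets: only the timestamp matters here
tweetsDT <- data.table(created_at = as.POSIXct(c(
  "2022-11-15 17:05:00", "2022-11-15 17:40:00",
  "2022-11-15 18:10:00", "2022-11-22 17:30:00"
), tz = "UTC"))

# Round each timestamp down to the start of its hour
tweetsDT[, hour := floor_date(created_at, "hour")]

# Count tweets per hour, then accumulate the counts over time
hourlyDT <- tweetsDT[, .(nTweets = .N), keyby = hour]
hourlyDT[, cumTweets := cumsum(nTweets)]

# Plot objects for the two views (print them to draw)
pHourly <- ggplot(hourlyDT, aes(x = hour, y = nTweets)) + geom_col()
pCumulative <- ggplot(hourlyDT, aes(x = hour, y = cumTweets)) + geom_step()
```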
4.2 Content analysis
Let’s try a word cloud.
Inspiration here: https://towardsdatascience.com/create-a-word-cloud-with-r-bde3e7422e8a
Make a word cloud for all tweets
The word clouds are unlikely to render the word ‘savingsession’ as it will be in all tweets due to the Twitter search pattern used.
We need to remove common words (to, the, and, a, for, etc). These are called ‘stop words’.
Not especially informative…
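For reference, stop word removal with tidytext might look like the sketch below. The two example tweets and the object names are invented for illustration; `stop_words` is the data set shipped with tidytext.

```r
library(dplyr)
library(tidytext)

# Two made-up tweets standing in for the real data
tweets <- tibble(text = c("Ready for the #savingSession tonight",
                          "The oven is off for the session"))

wordCounts <- tweets %>%
  unnest_tokens(word, text) %>%            # one lower-cased word per row
  anti_join(stop_words, by = "word") %>%   # drop 'the', 'for', 'is', ...
  count(word, sort = TRUE)                 # frequency of each remaining word

# The counts can then be passed to wordcloud (uncomment to draw):
# wordcloud::wordcloud(words = wordCounts$word, freq = wordCounts$n, min.freq = 1)
```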
5 Sentiment analysis
Inspired by https://www.tidytextmining.com/sentiment.html
Take those cleaned words and attach a sentiment to each of them!
The first word cloud shows words that have negative sentiment (according to tidytext::get_sentiments("bing")). Remember the size of the words is relative to the count of other negative words.
negative positive
286 236
# A tibble: 2 × 2
sentiment freq
<chr> <int>
1 negative 951
2 positive 1958
The second word cloud shows positive sentiments for all tweets. Remember the size of the words is relative to the count of other positive words.
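One way to attach sentiments and get counts like those above is an inner join against the bing lexicon. The sketch below uses a toy word count table; the object names and the words are assumptions for illustration only.

```r
library(dplyr)
library(tidytext)

# Toy word counts standing in for the cleaned tweet words
wordCounts <- tibble(word = c("happy", "bad", "oven", "great"),
                     n = c(3L, 2L, 5L, 1L))

# Keep only words that appear in the bing lexicon, with their sentiment label
sentimentWords <- wordCounts %>%
  inner_join(get_sentiments("bing"), by = "word")

# Number of distinct words per sentiment
table(sentimentWords$sentiment)

# Total word occurrences per sentiment
sentimentFreq <- sentimentWords %>%
  group_by(sentiment) %>%
  summarise(freq = sum(n))
```

Words that are not in the lexicon (here 'oven') drop out of the join, so the sentiment word clouds only ever show a subset of the vocabulary.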
5.1 Session 1 sentiment word clouds
Repeat these word clouds but just for the first session which was on 2022-11-15.
These are just the tweets for the day of the event and the day after…
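Selecting the tweets for a session's two-day window could be done roughly like this. The helper name `filterSession` and the `created_at` column are assumptions for the sketch, and dates are taken in the timestamp's own time zone.

```r
library(data.table)
library(lubridate)

# Hypothetical helper: keep tweets from the session day and the day after
filterSession <- function(dt, sessionDate) {
  firstDay <- as_date(sessionDate)
  dt[as_date(created_at) %in% c(firstDay, firstDay + 1)]
}

# Toy timestamps around the first session
tweetsDT <- data.table(created_at = as.POSIXct(c(
  "2022-11-14 23:00:00", "2022-11-15 18:00:00",
  "2022-11-16 09:00:00", "2022-11-17 00:30:00"
), tz = "UTC"))

session1DT <- filterSession(tweetsDT, "2022-11-15")
```

The same helper can be reused for the later sessions by changing the date.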
5.2 Session 2 word clouds
Repeat for session 2 which was on 2022-11-22.
These are just the tweets for the day of the event and the day after…
5.3 Session 3 word clouds
Repeat for session 3 which was on 2022-11-30.
These are just the tweets for the day of the event and the day after…
5.4 Session 4 word clouds
Repeat for session 4 which was on 2022-12-01.
These are just the tweets for the day of the event and the day after…
5.5 Session 5 word clouds
Repeat for session 5 which was on 2022-12-12.
These are just the tweets for the day of the event and the day after…